Exploratory analysis: what to do first.
Abstract
One of the more important but often overlooked parts of statistical analysis is the very first step: an exploratory and descriptive analysis. Typically, researchers take a quick look at the data and then dive into more complex regression models or t tests. In this column, I discuss preliminary analysis in general and look at some lesser-known techniques that give interesting and useful results.

The first step in understanding your data is to establish the kinds of variables you have. Are they continuous (ranging over several values, like weight or height) or categorical (taking only a few values)? Are the continuous variables bounded (like age, which can’t be less than zero) or unbounded? Are there any outliers or strange values? This last question can be checked in a simple way. Calculate the mean and standard deviation of a variable, and examine values that lie more than three (or, if you want to be very careful, two) standard deviations from the mean. Any outliers need to be investigated, and either eliminated (if they are errors) or treated carefully (if they are valid data points).

Next, look at histograms of the continuous variables and frequency tables of the categorical variables. These show roughly how each variable is distributed, which influences the statistics you can use.

The next thing to consider is bivariate analysis of the data. First, what do we do with continuous variables? A common mistake is to examine correlations first. These are usually an inefficient way of inspecting the data, because a significant correlation depends on a linear relationship between the variables; if the true relationship is curved, the correlation may not reveal the association. Another approach is to draw a scatterplot of the two variables and check for a relationship.
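Two of the checks above, the standard deviation screen for outliers and the warning about correlations, can be sketched in a few lines of Python. This is a minimal illustration, not part of the original column: the function names `flag_outliers` and `pearson_r` and both data sets are hypothetical, and only the standard library is assumed.

```python
from statistics import mean, stdev


def flag_outliers(values, k=3.0):
    """Return the values lying more than k sample standard deviations from the mean."""
    m = mean(values)
    s = stdev(values)
    return [v for v in values if abs(v - m) > k * s]


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed directly from its definition."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical weights in kg; 700 looks like a keying error for 70.
weights = [68, 72, 75, 70, 66, 74, 71, 69, 73, 67, 70, 72, 68, 71, 700]
suspects = flag_outliers(weights, k=3.0)

# Hypothetical curved relationship: y depends entirely on x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
r = pearson_r(xs, ys)

print("values to investigate:", suspects)
print("correlation for the curved relationship:", r)
```

In this made-up example the screen flags only the value 700, and the correlation for the curved data is zero even though y is completely determined by x, which is the situation a scatterplot would expose. Note that an extreme value also inflates the standard deviation itself, so in small samples the screen can miss outliers; flagged values are candidates for investigation, not automatic deletions.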
For categorical variables, it is easiest to inspect bivariate cross tabulations (for example, 2 × 2 tables) to identify patterns and potentially interesting relationships. These relationships provide the baseline for further analyses.

Finally, a multivariate exploratory analysis may be needed to detect possible confounding (the mixing of effects among an outcome, an exposure, and a third variable that is associated with the primary predictor and also affects the outcome) or effect modification (when the effect of an exposure on the outcome differs for different levels of a third variable). The easiest way to do this is with a bivariate analysis stratified by the third variable. If the third variable is categorical, look at the relationship between the other two variables within each of its levels; if it is continuous, first create a new categorical variable from it. If there is important confounding or effect modification (the definition of “important” here is arbitrary and depends on the needs of the analysis), it must be accounted for in the formal models when computing estimates for the primary predictor.

After these preliminary analyses, the patterns and relationships in the results should be reasonably clear and the analyses that need to be done should be obvious. If this is the case, then the rest is simple: for continuous variables, t tests, ANOVA, or linear regression can be used to confirm the exploratory work. Similarly, for categorical data, χ² or non-parametric tests can be used. If patterns in the results are not clear, two things are possible: either there aren’t any interesting relationships, or there are but they are complex and you need to consult a statistician!

Injury Prevention 1998;4:140
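The stratified check for confounding described above can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the counts are invented, the stratum labels are hypothetical, and the odds ratio is simply one convenient measure of association for a 2 × 2 table; the column itself does not prescribe a particular measure.

```python
def odds_ratio(table):
    """Odds ratio for a 2 x 2 table given as (a, b, c, d):
    a = exposed with outcome,   b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome."""
    a, b, c, d = table
    return (a * d) / (b * c)


# Hypothetical counts, one 2 x 2 table per level of the third variable.
strata = {
    "younger": (10, 90, 40, 360),
    "older": (160, 240, 40, 60),
}

# Crude table: add the strata together cell by cell, ignoring the third variable.
crude = tuple(sum(cells) for cells in zip(*strata.values()))

crude_or = odds_ratio(crude)
stratum_ors = {level: odds_ratio(table) for level, table in strata.items()}

print("crude table:", crude)
print("crude odds ratio:", round(crude_or, 2))
for level, value in stratum_ors.items():
    print(f"odds ratio among {level}:", round(value, 2))
```

With these invented counts the crude odds ratio is about 2.7, yet within each stratum it is exactly 1: the apparent association comes entirely from the third variable, which is the pattern expected under confounding. If the stratum-specific values had instead differed markedly from each other, that would point toward effect modification.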
Journal title:
- Injury prevention : journal of the International Society for Child and Adolescent Injury Prevention
Volume 4, Issue 2
Pages: -
Publication date: 1998